Understanding Complex Visually Referring Utterances
Authors
Abstract
We propose a computational model of visually-grounded spatial language understanding, based on a study of how people verbally describe objects in visual scenes. We describe our implementation of word-level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to a broad range of referring expressions for a large percentage of test cases. In an analysis of the system’s successes and failures, we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.
Similar resources
Grounded Semantic Composition for Visual Scenes
We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementatio...
Grounded Semantic Composition for Visual Scenes DRAFT - Do not cite
We present a study on how people verbally describe objects in visual scenes. The emphasis of our analysis lies on the combination of individual word meanings to produce meanings for complex referring expressions. Based on this study, we propose a computational model of visually-grounded spatial language understanding. We have implemented the model, and it is able to understand a broad range of ...
Grounded Semantic Composition for Visual Scenes
We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementatio...
Coordinating Understanding and Generation in an Abductive Approach to Interpretation
We use a dynamic, context-sensitive approach to abductive interpretation to describe coordinated processes of understanding, generation and accommodation in dialogue. The agent updates the dialogue uniformly for its own and its interlocutors’ utterances, by accommodating a new context, inferred abductively, in which utterance content is both true and prominent. The generator plans natural and c...
Evaluation of the Scusi? Spoken Language Interpretation System - A Case Study
We present a performance evaluation framework for Spoken Language Understanding (SLU) modules, focusing on three elements: (1) characterization of spoken utterances, (2) experimental design, and (3) quantitative evaluation metrics. We then describe the application of our framework to Scusi?, our SLU system that focuses on referring expressions.
Publication date: 2003